
Self-improvement: automation ledger, fact trust, hook analytics, test isolation #178

Merged: ScriptedAlchemy merged 15 commits into master from codex/self-improve-20260701 on Jul 2, 2026

@ScriptedAlchemy

Owner

Summary

Findings from a TraceDecay self-audit (session transcript mining + doctor + automation run/fact-proposal logs), fixed in one pass:

  • Fact-proposal trust rejected wholesale: the session reflector prompt didn't say trust must be numeric, so models emitted "trust": "high" and the validator rejected every proposal. The validator now accepts low/medium/high bucket labels (mapped to 0.15/0.5/0.85) and the prompt states the numeric requirement.
  • Run-ledger spam: every ~30s scheduler tick appended 3 skipped / scheduler_interval_not_elapsed records (1500+ noise rows). Consecutive identical scheduler skips per task now persist once; manual-trigger skips and reason/task transitions still persist.
  • Corrupt hook_analytics.jsonl: concurrent hook processes raced a read-modify-rewrite append, merging/dropping lines. Appends now use a single O_APPEND write (PrivateStoreIo::append_line).
  • Test pollution of the real profile store: branch_db_safety_test.rs ran against the developer's ~/.tracedecay, leaving 111 corrupt branch-meta.json files and ~7k stale registry rows (now repaired locally). The suite now runs under an isolated throwaway home (IsolatedEnv + TraceDecayStorageEnvGuard).
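The trust fix in the first bullet can be sketched as follows. This is a minimal illustration, not the project's validator: `parse_trust` is a hypothetical name, and the case-insensitive matching and the `[0, 1]` range check are assumptions. Only the three bucket values (0.15, 0.5, 0.85) come from the summary above.

```rust
// Representative numeric trust for each bucket label, as listed in the summary.
const TRUST_LOW: f64 = 0.15;
const TRUST_MEDIUM: f64 = 0.5;
const TRUST_HIGH: f64 = 0.85;

/// Hypothetical helper: accept either a bucket label or a plain number.
/// Returns `None` for anything else so the caller can reject the proposal.
pub fn parse_trust(raw: &str) -> Option<f64> {
    let trimmed = raw.trim();
    // Assumption: labels are matched case-insensitively.
    match trimmed.to_ascii_lowercase().as_str() {
        "low" => return Some(TRUST_LOW),
        "medium" => return Some(TRUST_MEDIUM),
        "high" => return Some(TRUST_HIGH),
        _ => {}
    }
    // Fall back to a numeric value. NaN and out-of-range numbers fail the
    // range check (assumed here to be the closed interval 0.0..=1.0).
    trimmed
        .parse::<f64>()
        .ok()
        .filter(|value| (0.0..=1.0).contains(value))
}
```

With this shape, `parse_trust("high")` and `parse_trust("0.85")` both yield a usable number, while an unknown label such as `"very"` is still rejected.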

Follow-up commits on this branch (in progress): daemon SIGTERM/SEGV investigation and an automatic post-update health pass for tracedecay update.

Test plan

  • cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings
  • cargo test --test automation_session_reflector_runner_test (7/7)
  • cargo test --test branch_db_safety_test (5/5, verified no writes to the real ~/.tracedecay during the run)
  • cargo test --lib automation::lifecycle (4/4, incl. 3 new skip-dedupe tests) and cargo test --lib hooks (36/36)

…nd test isolation

- accept low/medium/high bucket trust labels in session reflector fact
  proposals and clarify the numeric-trust prompt instruction
- stop persisting consecutive identical scheduler-skip run records that
  flooded the automation run ledger every tick
- append hook_analytics.jsonl lines via a single O_APPEND write so
  concurrent hook processes no longer corrupt or drop entries
- isolate branch_db_safety_test under a throwaway profile home so it
  stops writing corrupt branch-meta.json and stale registry rows into
  the real ~/.tracedecay store
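The `hook_analytics.jsonl` change above can be illustrated with a short standard-library sketch. `append_line` here is a stand-in written for this example, not the body of `PrivateStoreIo::append_line`, and it assumes each record is a single line with no embedded newline.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical stand-in for an append helper: one record, one write.
pub fn append_line(path: &Path, record: &str) -> io::Result<()> {
    // Build the whole line (record plus newline) first so it is handed to
    // the OS in one buffer instead of as separate record and newline writes.
    let mut line = String::with_capacity(record.len() + 1);
    line.push_str(record);
    line.push('\n');

    // `append(true)` opens the file with O_APPEND on Unix, so the kernel
    // positions every write at the current end of file.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(line.as_bytes())
}
```

Unlike a read-modify-rewrite cycle, this never reads or truncates the file, so a concurrent writer cannot have its lines overwritten by a stale copy. `write_all` may in principle issue more than one write call for a very large buffer; for short JSONL records on a regular file it is normally a single one.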
@changeset-bot

changeset-bot Bot commented Jul 1, 2026


⚠️ No Changeset found

Latest commit: 0491064

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

ScriptedAlchemy added 2 commits July 1, 2026 23:48
Graceful shutdown persists token counters and checkpoints WALs for every
live project server sequentially, which can exceed systemd's stop timeout
and end in a SIGKILL mid-checkpoint. Cap shutdown work at 45s, log the
timeout outcome, and abort the stalled task; SQLite WAL keeps remaining
state crash-safe.
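A deadline-capped shutdown of this kind can be sketched with the standard library alone. This is an illustration under stated assumptions: `shutdown_with_deadline` and `ShutdownOutcome` are hypothetical names, and the sketch uses an OS thread with `recv_timeout` rather than the daemon's async runtime. A std thread cannot be aborted, so on timeout the worker is simply left detached here, whereas the commit describes aborting the stalled task.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical outcome type for a capped shutdown.
#[derive(Debug, PartialEq)]
pub enum ShutdownOutcome {
    Completed,
    TimedOut,
    Failed,
}

/// Run `work` on a helper thread and wait at most `deadline` for it.
pub fn shutdown_with_deadline<F>(work: F, deadline: Duration) -> ShutdownOutcome
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel::<()>();
    thread::spawn(move || {
        work();
        // Ignore the send error: the waiter may already have given up.
        let _ = done_tx.send(());
    });

    match done_rx.recv_timeout(deadline) {
        Ok(()) => ShutdownOutcome::Completed,
        Err(mpsc::RecvTimeoutError::Timeout) => ShutdownOutcome::TimedOut,
        // The sender was dropped without sending, i.e. the worker panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => ShutdownOutcome::Failed,
    }
}
```

A caller would pass the persist-and-checkpoint work as the closure with `Duration::from_secs(45)` and log the returned outcome before exiting.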

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: deb256e427

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/branch_db_safety_test.rs Outdated
Comment on lines +23 to +24
_env_lock: tokio::sync::MutexGuard<'static, ()>,
storage: TraceDecayStorageEnvGuard,


P2: Keep the env lock until storage guards drop

When an IsolatedEnv is dropped, struct fields are dropped in declaration order, so _env_lock is released before storage restores HOME, TRACEDECAY_DATA_DIR, and the global DB override. If another test in this binary is waiting, it can acquire the lock and install its own isolated env while this guard's TraceDecayStorageEnvGuard then restores the old values over it, defeating the isolation this helper is meant to provide. Declare the lock after storage (or add a custom Drop) so it is dropped last.
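The drop-order rule behind this comment can be checked with a small self-contained sketch. The types below (`Noisy`, `LockFirst`, `LockLast`) are hypothetical stand-ins for the lock guard and the storage guard; they only record the order in which their fields are dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type DropLog = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in guard that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: DropLog,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn noisy(name: &'static str, log: &DropLog) -> Noisy {
    Noisy { name, log: Rc::clone(log) }
}

/// Hazardous layout: the lock is declared first, so it is released first.
struct LockFirst {
    _env_lock: Noisy,
    _storage: Noisy,
}

/// Suggested layout: storage is restored first, the lock is released last.
struct LockLast {
    _storage: Noisy,
    _env_lock: Noisy,
}

pub fn drop_order_lock_first() -> Vec<&'static str> {
    let log = DropLog::default();
    drop(LockFirst { _env_lock: noisy("env_lock", &log), _storage: noisy("storage", &log) });
    let order = log.borrow().clone();
    order
}

pub fn drop_order_lock_last() -> Vec<&'static str> {
    let log = DropLog::default();
    drop(LockLast { _storage: noisy("storage", &log), _env_lock: noisy("env_lock", &log) });
    let order = log.borrow().clone();
    order
}
```

`drop_order_lock_first()` returns `["env_lock", "storage"]` and `drop_order_lock_last()` returns `["storage", "env_lock"]`, which is why moving the lock field after the storage guard keeps the environment protected until it has been restored.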


Owner Author


Follow-up: the IsolatedEnv field order was fixed on master in 822d921, and the same drop-order hazard in the mcp-suite fixtures (TestProject, TestEnv, CrossProjectMemoryEnv) is fixed in #198.

ScriptedAlchemy added 9 commits July 1, 2026 23:57
…20260701

# Conflicts:
#	src/automation/runner.rs
#	tests/automation_session_reflector_runner_test.rs

After refreshing the binary, plugins, and daemon, `tracedecay update` now
re-execs a post-update health pass: applies idempotent global-DB schema
migrations, quarantines corrupt branch-meta.json files as
branch-meta.json.corrupt-<timestamp>, purges stale registry rows under the
system temp dir, and summarizes remaining doctor findings. The pass is
failure-tolerant (warnings, never update failure) and skippable with
--no-heal.
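The quarantine step described in this commit could look roughly like the sketch below. It is not the project's code: `quarantine_if_corrupt` is a hypothetical name, the validity check is passed in as a closure in place of the real branch-meta parse, and the Unix-seconds suffix is an assumed timestamp format.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: if `is_valid` rejects the file's contents, rename the
/// file to `<name>.corrupt-<unix-seconds>` beside the original and return the
/// new path. Returns `Ok(None)` when the file is healthy.
pub fn quarantine_if_corrupt(
    path: &Path,
    is_valid: impl Fn(&str) -> bool,
) -> io::Result<Option<PathBuf>> {
    // Contents that are not valid UTF-8 count as corrupt; other I/O errors
    // (for example a missing file) are returned to the caller.
    let healthy = match fs::read_to_string(path) {
        Ok(text) => is_valid(&text),
        Err(err) if err.kind() == io::ErrorKind::InvalidData => false,
        Err(err) => return Err(err),
    };
    if healthy {
        return Ok(None);
    }

    // Assumed timestamp format: seconds since the Unix epoch.
    let stamp = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_secs())
        .unwrap_or(0);
    let mut name = path.file_name().map(|n| n.to_os_string()).unwrap_or_default();
    name.push(format!(".corrupt-{stamp}"));

    let target = path.with_file_name(name);
    fs::rename(path, &target)?;
    Ok(Some(target))
}
```

Renaming rather than deleting keeps the damaged file available for inspection, and a failure-tolerant caller can downgrade any `Err` from this function to a warning.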

branch_meta now owns the one canonical parse used by both load_branch_meta
and the post-update heal quarantine, so schema-corrupt files (valid JSON,
wrong shape) are quarantined instead of warning on every open. Restructures
the health pass into compute/render, fetches the registry once, makes
stale_code_projects borrow with a named StaleRootScope predicate, adds a
shared 0o600-at-create private-open helper in PrivateStoreIo, and documents
the heal-by-default policy. Adds unit + integration tests for the
schema-corrupt quarantine path.
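The "0o600-at-create" idea mentioned in this commit can be sketched with the standard library on Unix. `open_private_append` is a hypothetical name for illustration and is not the helper in `PrivateStoreIo`.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Hypothetical helper (Unix only): open a file for appending and, if it has
/// to be created, create it readable and writable by the owner only.
pub fn open_private_append(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .create(true)
        .append(true)
        // The mode is applied only when the file is created; an existing file
        // keeps its current permissions. The process umask can still clear
        // bits, but it cannot add group or other access.
        .mode(0o600)
        .open(path)
}
```

Setting the mode at creation avoids the window that a create-then-chmod sequence leaves open, during which the file briefly exists with default permissions.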

The scheduler gate now loads the run ledger once and threads the records
through the run context, so gate-level and post-gate skip dedup share that
one read and append_skipped_record is a pure append-unless-repeat with no
second I/O pass. Also inlines tokio::time::timeout for the daemon shutdown
deadline (a panic in shutdown_all no longer reads as success) and derives
the session-reflector trust-label representatives from named memory::trust
constants with a drift-guard test.
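An append-unless-repeat rule over already-loaded records can be sketched as below. This is a simplified illustration: the `SkipRecord` and `Trigger` types and the function name are hypothetical, and the ledger is modelled as an in-memory list of skip records only, whereas the real run ledger holds other record kinds as well.

```rust
/// Hypothetical trigger kinds for a skipped run.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Trigger {
    Scheduler,
    Manual,
}

/// Hypothetical, simplified ledger row for a skipped run.
#[derive(Clone, Debug, PartialEq)]
pub struct SkipRecord {
    pub task: String,
    pub reason: String,
    pub trigger: Trigger,
}

/// Append `record` unless it is a scheduler skip that repeats the most recent
/// record for the same task. Returns `true` when the record was appended.
/// The function works on records that were loaded once by the caller and does
/// no I/O of its own.
pub fn append_skipped_unless_repeat(ledger: &mut Vec<SkipRecord>, record: SkipRecord) -> bool {
    let repeats_last = record.trigger == Trigger::Scheduler
        && ledger
            .iter()
            .rev()
            .find(|existing| existing.task == record.task)
            .map_or(false, |last| {
                last.trigger == Trigger::Scheduler && last.reason == record.reason
            });
    if repeats_last {
        return false;
    }
    ledger.push(record);
    true
}
```

Under this rule a run of identical scheduler skips for one task collapses to a single row, while a manual-trigger skip or a change of reason still produces a new row.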

Moves the update/post-update wiring (plugin refresh, daemon refresh,
subprocess re-exec, health pass) into src/update_cmd.rs following the
*_cmd convention, bringing main.rs to 871 lines. Also promotes the
branch-DB tests' IsolatedEnv into tests/common as the canonical
env-isolation helper.
ScriptedAlchemy merged commit a431724 into master on Jul 2, 2026
18 checks passed